Abstract. The constraint of difference has been known to the constraint
programming community since Lauriere introduced Alice [11] in 1978. Since
then, several solving strategies have been designed for this constraint. In
this paper we give both a practical overview and an abstract comparison
of these different strategies.



1 Introduction 

Many problems from combinatorial optimization can be modeled and solved
using techniques from Constraint Programming [?]. One of the constraints that
arises naturally in these models is the alldifferent constraint, which states
that all variables in this constraint must be pairwise different. In Example 1, a
scheduling problem is modeled using the alldifferent constraint.

Example 1 (Scheduling of speeches). Consider the following simple scheduling
problem, adapted from Puget [17], where a set of speeches must be scheduled
during one day. Each speech lasts exactly one hour (including questions and
a coffee break), and only one conference room is available. Furthermore, each
speaker has other commitments, and is available only for a limited fraction of
the day. A particular instance of this problem is given in Table 1, where the
fractions are defined by an earliest and latest possible time slot.

Table 1. Time slots for the speakers

Speaker      Earliest   Latest
Sebastian    3          6
Frederic     3          4
Jan-Georg    2          5
Krzysztof    2          4
Maarten      3          4
Luca         1          6

This problem can be modeled as follows. We create one variable per speaker,
whose value will be the period of that speaker's speech. The initial domains of
the variables will be the available time intervals as stated in Table 1. Since
two speeches cannot be held at the same time in the same conference room, the
periods for two different speakers must be different. The constraints for this
scheduling problem thus become:

$$x_1 \in [3,6],\quad x_2 \in [3,4],\quad x_3 \in [2,5],$$
$$x_4 \in [2,4],\quad x_5 \in [3,4],\quad x_6 \in [1,6],$$
$$\mathtt{alldifferent}(x_1, x_2, x_3, x_4, x_5, x_6).$$

To find a solution to a model as in the previous example, a constraint solver
essentially builds a search tree from all possible variable values. In general,
finding a solution for such problems is NP-complete, and this search tree can
grow extremely large. Therefore, strategies have been developed to prune parts
of the search tree. In Constraint Programming, these strategies mainly consist
of the simplification of the problem during the search for a solution. The
techniques that are most widely applied are so-called consistency techniques
that can reduce the domains of the variables, based on the constraints between
them. Therefore, algorithms that achieve some state of consistency are also
called filtering algorithms.
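
The search tree mentioned above can be illustrated with a minimal backtracking sketch in Python for the instance of Example 1. The names `DOMAINS` and `solve` are hypothetical, the variable order is simply the table order, and no filtering is performed, so this shows plain search rather than the strategy of any particular solver.

```python
# Domains of Example 1, one variable per speaker (time slots are integers).
DOMAINS = {
    "Sebastian": range(3, 7),   # [3,6]
    "Frederic": range(3, 5),    # [3,4]
    "Jan-Georg": range(2, 6),   # [2,5]
    "Krzysztof": range(2, 5),   # [2,4]
    "Maarten": range(3, 5),     # [3,4]
    "Luca": range(1, 7),        # [1,6]
}


def solve(domains, assignment=None):
    """Depth-first search; returns one pairwise-different assignment or None."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(domains):
        return dict(assignment)
    # Pick the first variable that has no value yet.
    var = next(v for v in domains if v not in assignment)
    for value in domains[var]:
        if value not in assignment.values():   # the alldifferent check
            assignment[var] = value
            result = solve(domains, assignment)
            if result is not None:
                return result
            del assignment[var]                # undo and try the next value
    return None


schedule = solve(DOMAINS)
```

Every dead end visited by this search corresponds to a subtree that a filtering algorithm might have pruned beforehand.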

This paper deals with consistency techniques or filtering algorithms that can
be deduced from the alldifferent constraint. It turns out that there exist
different degrees of consistency, each degree allowing more or fewer values in
the variable domains. In general it takes more time to obtain a stronger
consistency than to obtain a weaker consistency. So with more effort, one could
remove more values. Therefore, for each individual problem one has to make a
trade-off between the effort (time) and the gain (domain shrinking) when
choosing a particular consistency to achieve.

1.1 Overview 

The different degrees of consistency will be defined in Section 2, together with
some more preliminaries. Then each of the Sections 3 up to 6 will treat one
consistency technique. These sections are ordered by increasing strength of the
considered consistency. The treatment consists of a description of the particular
consistency with respect to the alldifferent constraint, together with an
algorithm that achieves this consistency. Finally a conclusion is given in
Section 7.

2 Preliminaries 

A constraint satisfaction problem (CSP) is defined as a finite set of variables
$X = \{x_1, \dots, x_n\}$, with domains $\mathcal{D} = \{D_1, \dots, D_n\}$
associated with them, together with a finite set of constraints $\mathcal{C}$,
each on a subset of $X$. A CSP $P$ will also be denoted as
$P = (X, \mathcal{D}, \mathcal{C})$. A constraint $C \in \mathcal{C}$ is defined
as a subset of the Cartesian product of the domains of the variables that are
in $C$. For instance, $C(x_1, x_3, x_4) \subseteq D_1 \times D_3 \times D_4$.
An $n$-tuple $(d_1, \dots, d_n) \in D_1 \times \cdots \times D_n$ is a solution
to a CSP if for every constraint $C \in \mathcal{C}$ on the variables
$x_{i_1}, \dots, x_{i_m}$ we have $(d_{i_1}, \dots, d_{i_m}) \in C$. For finite,
linearly ordered domains $D_i$, we define $\min D_i$ and $\max D_i$ to be the
minimum value and the maximum value of the domain $D_i$.

We now introduce four notions of local consistency in the order they will 
be discussed in the text. Note the use of braces ({, }) and brackets ([, ]) that 
indicate a set and an interval of domain values respectively. 

Definition 1 (Arc consistency). A binary constraint $C(x_1, x_2)$ where $D_1$
and $D_2$ are non-empty, is called arc consistent iff
$\forall d_1 \in D_1\ \exists d_2 \in D_2$ such that $(d_1, d_2) \in C$, and
$\forall d_2 \in D_2\ \exists d_1 \in D_1$ such that $(d_1, d_2) \in C$.

Definition 2 (Bound consistency). An $m$-ary constraint $C(x_1, \dots, x_m)$
where no domain $D_i$ is empty, is called bound consistent iff for each variable
$x_i$: $\forall d_i \in \{\min D_i, \max D_i\}$,
$\forall j \in \{1, \dots, m\} - \{i\}$,
$\exists d_j \in [\min D_j, \max D_j]$ such that $(d_1, \dots, d_m) \in C$.

Definition 3 (Range consistency). An $m$-ary constraint $C(x_1, \dots, x_m)$
where no domain $D_i$ is empty, is called range consistent iff for each variable
$x_i$: $\forall d_i \in D_i$, $\forall j \in \{1, \dots, m\} - \{i\}$,
$\exists d_j \in [\min D_j, \max D_j]$ such that $(d_1, \dots, d_m) \in C$.

Definition 4 (Hyper-arc consistency). An $m$-ary constraint
$C(x_1, \dots, x_m)$ where no domain $D_i$ is empty, is called hyper-arc
consistent iff for each variable $x_i$: $\forall d_i \in D_i$,
$\forall j \in \{1, \dots, m\} - \{i\}$, $\exists d_j \in D_j$ such that
$(d_1, \dots, d_m) \in C$.

In other words, both arc consistency and hyper-arc consistency check whether
every value in every domain belongs to a feasible instance of the constraint,
based on the domains. Range consistency, however, does not check the feasibility
of the constraint with respect to the domains, but with respect to intervals that
include the domains. It can be regarded as a relaxation of hyper-arc consistency.
Bound consistency can be regarded as a relaxation of range consistency. It does
not even check all values in the domains, but only the minimum and the maximum
value, while still verifying the constraint with respect to intervals that
include the domains. This is formalized in Proposition 1.

Definition 5 (Consistent CSP). A CSP is arc consistent if all its binary 
constraints are. A CSP is range consistent, respectively, bound consistent or 
hyper-arc consistent if all its constraints are. 

Consider a CSP $P$. If we apply to $P$ an algorithm that achieves range
consistency on $P$, we will denote the result as $\Phi_R(P)$. Analogously,
$\Phi_B(P)$, $\Phi_A(P)$ and $\Phi_{HA}(P)$ denote the achievement of bound
consistency, arc consistency and hyper-arc consistency on $P$ respectively. Let
$P_\emptyset$ denote a failed CSP, i.e. a CSP with at least one empty domain. We
define a CSP $P = (X, \mathcal{D}, \mathcal{C})$ to be smaller than a CSP
$P' = (X', \mathcal{D}', \mathcal{C})$ if $\mathcal{D} \subseteq \mathcal{D}'$.
This relation is written as $P \preceq P'$. A CSP $P$ is strictly smaller than a
CSP $P'$, i.e. $P \prec P'$, when $\mathcal{D} \subseteq \mathcal{D}'$ and
$D_i \subset D_i'$ for at least one $i$. When both $P \preceq P'$ and
$P' \preceq P$ we write $P = P'$. By convention, $P_\emptyset$ is the smallest
CSP. This notation is adopted from [?].



Proposition 1. $\Phi_{HA}(P) \preceq \Phi_R(P) \preceq \Phi_B(P)$.



Proof. Both hyper-arc consistency and range consistency verify all values of
all domains. But hyper-arc consistency verifies the constraints with respect to
the exact domains $D_i$, while range consistency verifies the constraints with
respect to intervals that include the domains: $[\min D_i, \max D_i]$. A
constraint that holds on a domain $D_i$ also holds on the interval
$[\min D_i, \max D_i]$ since $D_i \subseteq [\min D_i, \max D_i]$. The converse
is not true, see Example 2. Hence $\Phi_R(P) \succeq \Phi_{HA}(P)$.

Both range consistency and bound consistency verify the constraints with
respect to intervals that include the domains. But bound consistency only
considers $\min D_i$ and $\max D_i$ for a domain $D_i$, while range consistency
considers all values in $D_i$. Since $\{\min D_i, \max D_i\} \subseteq D_i$,
$\Phi_B(P) \succeq \Phi_R(P)$. Example 2 shows that
$\Phi_B(P) \succ \Phi_R(P)$ cannot be ruled out.

The following examples clarify Proposition 1.

Example 2 (Comparing consistencies). Consider the following CSP:

$$P = \begin{cases} x_1 \in \{1,3\},\ x_2 \in \{2\},\ x_3 \in \{1,2,3\}, \\ \mathtt{alldifferent}(x_1, x_2, x_3). \end{cases}$$

Then $\Phi_B(P) = P$, while

$$\Phi_R(P) = \begin{cases} x_1 \in \{1,3\},\ x_2 \in \{2\},\ x_3 \in \{1,3\}, \\ \mathtt{alldifferent}(x_1, x_2, x_3). \end{cases}$$

and $\Phi_{HA}(P) = \Phi_R(P)$. Next, consider the CSP

$$P' = \begin{cases} x_1 \in \{1,3\},\ x_2 \in \{1,3\},\ x_3 \in \{1,3\}, \\ \mathtt{alldifferent}(x_1, x_2, x_3). \end{cases}$$

This CSP is obviously inconsistent, since there are only two values available,
namely 1 and 3, for three variables that must be pairwise different.
$\Phi_{HA}(P')$ will detect this inconsistency, while $\Phi_R(P') = P'$.
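
The claims of Example 2 can be checked mechanically by applying Definitions 2 to 4 literally. The sketch below does this by brute-force enumeration of tuples, so it is exponential and only meant for tiny instances. The names `supported`, `prune`, `P` and `P_PRIME` are hypothetical, and a CSP is assumed to consist of a single alldifferent constraint given as a list of domains.

```python
from itertools import product


def supported(i, value, supports):
    """True if some pairwise-different tuple takes `value` at position i,
    with every other position drawn from its support set."""
    pools = [[value] if j == i else sorted(s) for j, s in enumerate(supports)]
    return any(len(set(t)) == len(t) for t in product(*pools))


def prune(domains, level):
    """Filter to a fixpoint; `level` is 'bound', 'range' or 'hyper-arc'."""
    domains = [set(d) for d in domains]
    changed = True
    while changed and all(domains):
        changed = False
        if level == "hyper-arc":
            supports = [set(d) for d in domains]            # exact domains
        else:
            supports = [set(range(min(d), max(d) + 1)) for d in domains]
        for i, dom in enumerate(domains):
            # Bound consistency only inspects the two extreme values.
            candidates = {min(dom), max(dom)} if level == "bound" else set(dom)
            dead = {v for v in candidates if not supported(i, v, supports)}
            if dead:
                dom -= dead
                changed = True
                break            # supports are stale, recompute them
    return domains


P = [{1, 3}, {2}, {1, 2, 3}]
P_PRIME = [{1, 3}, {1, 3}, {1, 3}]
```

An empty set in the returned list plays the role of the failed CSP: `prune(P_PRIME, "hyper-arc")` empties a domain, whereas `prune(P_PRIME, "range")` leaves all three domains untouched.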

A useful theorem to derive algorithms that ensure consistency for the
alldifferent constraint is Hall's Theorem [?]. The following formulation is
stated in terms of the alldifferent constraint. The cardinality of a set $K$ is
denoted by $|K|$.

Theorem 1 (Hall). The constraint $\mathtt{alldifferent}(x_1, \dots, x_n)$ on the
variables $x_1, \dots, x_n$ with respective domains $D_1, \dots, D_n$ has a
solution if and only if no subset $K \subseteq \{x_1, \dots, x_n\}$ exists such
that $|K| > |\bigcup_{x_i \in K} D_i|$.

As an application of Theorem 1, let us return to the CSP $P'$ in Example 2. Take
as subset $K = \{x_1, x_2, x_3\}$, then $|K| = 3$. Furthermore,
$|\bigcup_{x_i \in K} D_i| = |\{1, 3\}| = 2$. For this subset $K$, Hall's
condition does not hold ($3 > 2$), hence this CSP has no solution.
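
Hall's condition can be tested directly by enumerating subsets of variables. The following sketch does so in exponential time, so it is only an illustration of Theorem 1 and not an efficient feasibility test. The function name `hall_violation` and the representation of variables by list indices are assumptions made for this example.

```python
from itertools import combinations


def hall_violation(domains):
    """Return a tuple K of variable indices with |K| > |union of D_i, i in K|,
    or None when no such subset exists (i.e. the constraint has a solution)."""
    n = len(domains)
    for size in range(1, n + 1):
        for K in combinations(range(n), size):
            union = set().union(*(domains[i] for i in K))
            if len(K) > len(union):
                return K
    return None
```

For the CSP $P'$ of Example 2 the only violating subset is the full variable set, which is exactly the subset $K$ used in the text.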



3 Local Consistency of a Decomposed CSP 



The standard filtering algorithm for the alldifferent constraint is as follows.
Whenever the domain of a variable contains only one value, remove this value
from the domains of the other variables that occur in the alldifferent
constraint. This procedure is repeated as long as possible. Although this
algorithm might seem rather poor or naive, it has been successfully implemented
in many constraint solvers, for instance in the system Chip [?].
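
The procedure just described fits in a few lines. The sketch below is one possible rendering in Python, not the implementation of any particular solver; the name `propagate_singletons` and the convention of returning `None` for a failed CSP are assumptions.

```python
def propagate_singletons(domains):
    """Whenever a domain holds a single value, remove that value from every
    other domain; repeat until nothing changes. Returns None on failure."""
    domains = [set(d) for d in domains]
    queue = [i for i, d in enumerate(domains) if len(d) == 1]
    done = set()
    while queue:
        i = queue.pop()
        if i in done:
            continue
        done.add(i)
        (value,) = domains[i]          # the single remaining value of x_i
        for j, d in enumerate(domains):
            if j != i and value in d:
                d.discard(value)
                if not d:
                    return None        # a domain became empty
                if len(d) == 1:
                    queue.append(j)    # a new singleton to propagate
    return domains
```

For instance, the domains `[{1}, {1, 2}, {1, 2, 3}]` are reduced step by step to `[{1}, {2}, {3}]`, while domains without any singleton are left unchanged.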

This filtering algorithm can also be described as follows. A common way to
rewrite the alldifferent constraint is to generate a sequence of disequalities.
For instance

$$\mathtt{alldifferent}(x_1, x_2, x_3, x_4) \iff \begin{cases} x_1 \neq x_2,\ x_1 \neq x_3,\ x_1 \neq x_4, \\ x_2 \neq x_3,\ x_2 \neq x_4,\ x_3 \neq x_4. \end{cases}$$

If we apply an algorithm that achieves arc consistency on this set of binary
constraints, we obtain the same filtering as described above. One of the
drawbacks of this method is the quadratic increase of the number of constraints.
One needs $\binom{n}{2} = \frac{1}{2}(n^2 - n)$ disequalities to express an
$n$-ary alldifferent constraint. But an even more important drawback is the loss
of information. When this set of binary constraints is being made arc
consistent, only two variables are compared at a time. However, when the
alldifferent constraint is being made hyper-arc consistent, all variables are
considered at the same time, which gives a much stronger consistency. This is
shown in Proposition 2. Let $P_{dec}$ denote the decomposed CSP $P$ in which all
alldifferent constraints have been replaced by a sequence of disequalities.
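
A small sketch may make the decomposition concrete. It generates the $\frac{1}{2}(n^2 - n)$ disequalities and runs a generic AC-3 style loop on them; AC-3 is our choice here for illustration and is not prescribed by the text. The names `decompose` and `arc_consistency_decomposed` are hypothetical.

```python
from collections import deque
from itertools import combinations


def decompose(n):
    """Index pairs (i, j), i < j, one per disequality x_i != x_j."""
    return list(combinations(range(n), 2))


def arc_consistency_decomposed(domains):
    """Make every disequality arc consistent; returns None on failure."""
    domains = [set(d) for d in domains]
    arcs = deque()
    for i, j in decompose(len(domains)):
        arcs.extend([(i, j), (j, i)])          # both directions of each pair
    while arcs:
        i, j = arcs.popleft()
        # d in D_i has no support for x_i != x_j only when D_j == {d}.
        unsupported = {d for d in domains[i] if domains[j] == {d}}
        if unsupported:
            domains[i] -= unsupported
            if not domains[i]:
                return None
            # D_i shrank, so arcs pointing at x_i must be revisited.
            arcs.extend((k, i) for k in range(len(domains)) if k != i)
    return domains
```

A value is only ever removed when some other domain is a singleton, which is why this loop performs the same filtering as the standard algorithm described above.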

Proposition 2. $\Phi_{HA}(P) \preceq \Phi_A(P_{dec})$.

Proof. Since the definitions of arc consistency and hyper-arc consistency are
equivalent for binary constraints, we only need to consider the filtering of the
alldifferent constraint. Consider the constraint
$C$: $\mathtt{alldifferent}(x_1, \dots, x_n)$ and the corresponding
decomposition in terms of disequalities, denoted by $C_{dec}$. If a value
$d_i \in D_i$ is not arc consistent w.r.t. the set $C_{dec}$, then it is also
not hyper-arc consistent w.r.t. $C$. Indeed, when $d_i$ is not arc consistent
w.r.t. $C_{dec}$, then for some variable $x_j$ we cannot find a $d_j \in D_j$
such that $d_i \neq d_j$. But then we also cannot find an $n$-tuple
$(d_1, \dots, d_n) \in C$ that contains $d_i$. Therefore,
$\Phi_{HA}(P) \preceq \Phi_A(P_{dec})$. The converse is not true, as
illustrated in Example 3.

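Example 3 (Hyper-arc and arc consistency compared). For some integer $n > 3$,
consider the CSPs

$$P = \begin{cases} x_1 \in \{1, \dots, n-1\},\ \dots,\ x_{n-1} \in \{1, \dots, n-1\},\ x_n \in \{1, \dots, n\}, \\ \mathtt{alldifferent}(x_1, \dots, x_n) \end{cases}$$

$$P_{dec} = \begin{cases} x_1 \in \{1, \dots, n-1\},\ \dots,\ x_{n-1} \in \{1, \dots, n-1\},\ x_n \in \{1, \dots, n\}, \\ \bigwedge_{i < j} x_i \neq x_j \end{cases}$$

Now $\Phi_A(P_{dec}) = P_{dec}$, while

$$\Phi_{HA}(P) = \begin{cases} x_1 \in \{1, \dots, n-1\},\ \dots,\ x_{n-1} \in \{1, \dots, n-1\},\ x_n \in \{n\}, \\ \mathtt{alldifferent}(x_1, \dots, x_n). \end{cases}$$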

Our next goal is to find a consistency notion for the set of disequalities that
is equivalent to the hyper-arc consistency notion for the alldifferent
constraint. Relational consistency can be used for this.

Definition 6 (Relational $(1, m)$-consistency [?]). A set of constraints
$S = \{C_1, \dots, C_m\}$ is relationally $(1, m)$-consistent iff all domain
values $d \in D_i$ of variables appearing in $S$ appear in a solution to the $m$
constraints, evaluated simultaneously. A CSP
$P = (X, \mathcal{D}, \mathcal{C})$ is relationally $(1, m)$-consistent iff
every set of $m$ constraints $S \subseteq \mathcal{C}$ is relationally
$(1, m)$-consistent.

Note that arc consistency is equivalent to relational $(1, 1)$-consistency.

Again, let $P$ be the CSP that consists only of the alldifferent constraint
and a corresponding set of variables and domains.

Proposition 3. $\Phi_{HA}(P) = \Phi_{R(1, \frac{1}{2}(n^2 - n))C}(P_{dec})$.

Proof. By construction we have that the alldifferent constraint is equivalent
to the simultaneous consideration of the sequence of corresponding disequalities.
The number of disequalities is precisely $\frac{1}{2}(n^2 - n)$. If we consider
only $\frac{1}{2}(n^2 - n) - i$ disequalities simultaneously
($1 \leq i \leq \frac{1}{2}(n^2 - n) - 1$), there are $i$ unconstrained
relations between variables, and the corresponding variables could take the same
value when a certain instantiation is considered. Therefore, we really need to
take all $\frac{1}{2}(n^2 - n)$ constraints into consideration, which
corresponds to the relational $(1, \frac{1}{2}(n^2 - n))$-consistency.

As suggested before, the pruning performance of $\Phi_A(P_{dec})$ is rather
poor. Moreover, the complexity is relatively high, namely around $O(n^2)$,
whereas the hyper-arc consistency algorithms are around $O(dn^{1.5})$, where $d$
is the maximum cardinality of the domains and $n$ is the number of variables
involved [?]. Nevertheless, this filtering algorithm applies quite well to
several problems, such as the $n$-queens problem ($n < 200$) [?].

Other work on the comparison of the alldifferent constraint and the
corresponding decomposition has for instance been done in [?] and [?].



4 Bound Consistency 

The notion of bound consistency for the alldifferent constraint was introduced
by Puget [17]. We summarize his method in this section. Puget uses Hall's
Theorem to construct an algorithm that achieves bound consistency.

Definition 7 (Hall interval). Given an interval $I$, let $K_I$ be the set of
variables $x_i$ such that $D_i \subseteq I$. We say that $I$ is a Hall interval
iff $|I| = |K_I|$.

Proposition 4 (Puget [17]). The constraint
$\mathtt{alldifferent}(x_1, \dots, x_n)$ where no domain $D_i$ is empty, is
bound consistent iff

- for each interval $I$: $|K_I| \leq |I|$,
- for each Hall interval $I$: $\{\min D_i, \max D_i\} \cap I = \emptyset$ for
  all $x_i \notin K_I$.

Proposition 4 can be used to construct an algorithm that achieves bound
consistency on the alldifferent constraint. Indeed, we could check every
interval $I$ with bounds ranging from the minimum of all domains to the maximum
of all domains. When $|I| < |K_I|$, we know that the constraint is inconsistent.
And for each Hall interval, we remove all $\min D_i$ and $\max D_i$ until
$\{\min D_i, \max D_i\} \cap I = \emptyset$. Puget gives an implementation with
a time complexity of $O(n \log n)$.

In [15], Mehlhorn and Thiel present an algorithm that achieves bound
consistency of the alldifferent constraint in time $O(n)$ plus the time required
for sorting the interval endpoints. In particular, if the endpoints are from a
range of size $O(n^k)$ for some constant $k$, the algorithm runs in linear time.

Example 4. The following simple problem shows an application of the bound
consistency algorithm based on intervals.

$$P = \begin{cases} x_1 \in \{1,2\},\ x_2 \in \{1,2\},\ x_3 \in \{2,3\}, \\ \mathtt{alldifferent}(x_1, x_2, x_3). \end{cases}$$

Intuitively, observe that the variables $x_1$ and $x_2$ both have domain
$\{1,2\}$. So these two variables together range over two values, and for a
feasible instantiation they must be different. This means that the values 1 and
2 must be assigned to these two variables. Hence, values 1 and 2 cannot be
assigned to any other variable and therefore, value 2 will be removed from the
domain of $x_3$.

The algorithm detects this when the interval $I$ is set to $I = [1, 2]$. Then
the number of variables for which $D_i \subseteq I$ is 2, namely $x_1$ and
$x_2$. Since $|I| = 2$, $I$ is a Hall interval. The domain of $x_3$ is not
contained in this interval, and
$\{\min D_3, \max D_3\} \cap I = \{\min D_3\}$. In order to obtain the empty set
on the right-hand side of the last equation, we need to remove $\min D_3$. The
resulting CSP is bound consistent.
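
The interval-checking procedure suggested by Proposition 4 can be sketched as follows. This is a naive version that inspects every interval between the global minimum and maximum and repeats until nothing changes; it is not Puget's $O(n \log n)$ implementation nor the algorithm of Mehlhorn and Thiel. The name `bound_consistency` and the use of `None` for a failed CSP are assumptions, and domain values are assumed to be integers.

```python
def bound_consistency(domains):
    """Prune domain bounds with Hall intervals until a fixpoint is reached."""
    domains = [set(d) for d in domains]
    changed = True
    while changed:
        changed = False
        lo = min(min(d) for d in domains)
        hi = max(max(d) for d in domains)
        for a in range(lo, hi + 1):
            for b in range(a, hi + 1):
                # K_I: variables whose whole domain lies inside I = [a, b].
                inside = [i for i, d in enumerate(domains)
                          if a <= min(d) and max(d) <= b]
                size = b - a + 1
                if len(inside) > size:
                    return None                  # |K_I| > |I|: inconsistent
                if len(inside) < size:
                    continue                     # not a Hall interval
                for i, d in enumerate(domains):
                    if i in inside:
                        continue
                    # Drop bounds of D_i for as long as they fall inside I.
                    while d and (a <= min(d) <= b or a <= max(d) <= b):
                        d.discard(min(d) if a <= min(d) <= b else max(d))
                        changed = True
                    if not d:
                        return None
    return domains
```

On the CSP of Example 4 the interval $[1, 2]$ is recognized as a Hall interval and value 2 is removed from the domain of $x_3$.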



5 Range Consistency 

An algorithm that achieves range consistency was introduced by Leconte [12].
We follow the same procedure as in the previous example. Leconte also uses
Hall's Theorem to construct the algorithm.

Definition 8 (Hall set). Given a set of variables $K$, let $I_K$ be the interval
$[\min D_K, \max D_K]$, where $D_K = \bigcup_{x_i \in K} D_i$. We say that $K$
is a Hall set iff $|K| = |I_K|$.

Note that in the above definition $I_K$ does not necessarily need to be a Hall
interval.



Proposition 5 (Leconte [12]). The constraint
$\mathtt{alldifferent}(x_1, \dots, x_n)$ where no domain $D_i$ is empty, is
range consistent iff for each Hall set $K \subseteq \{x_1, \dots, x_n\}$:
$D_i \cap I_K = \emptyset$ for all $x_i \notin K$.

We can deduce an algorithm from Proposition 5 in a similar way as we did
for the algorithm for bound consistency. Leconte implemented an algorithm that
achieves range consistency with a complexity of $O(n^2 d)$, where $d$ is the
average size of the domains.
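
A direct, exponential reading of Proposition 5 enumerates all subsets $K$ and prunes with every Hall set it finds. The sketch below takes this route purely for illustration; it is not Leconte's $O(n^2 d)$ algorithm. The name `range_consistency` is hypothetical, domain values are assumed to be integers, and `None` stands for a failed CSP.

```python
from itertools import combinations


def range_consistency(domains):
    """Remove I_K from the domains outside every Hall set K, to a fixpoint."""
    domains = [set(d) for d in domains]
    n = len(domains)
    changed = True
    while changed:
        changed = False
        for size in range(1, n + 1):
            for K in combinations(range(n), size):
                lo = min(min(domains[i]) for i in K)
                hi = max(max(domains[i]) for i in K)
                width = hi - lo + 1              # |I_K|
                if size > width:
                    return None                  # Hall's condition violated
                if size < width:
                    continue                     # K is not a Hall set
                interval = set(range(lo, hi + 1))
                for i in range(n):
                    if i not in K:
                        before = len(domains[i])
                        domains[i] -= interval
                        if not domains[i]:
                            return None
                        changed = changed or len(domains[i]) != before
    return domains
```

On the first CSP of Example 2 the singleton $\{x_2\}$ is a Hall set with $I_K = [2, 2]$, so value 2 is removed from the domain of $x_3$; the CSP $P'$ of the same example is left unchanged.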

Observe that this algorithm is similar to the algorithm for bound consistency.
Where the algorithm for bound consistency takes the domains as a starting point,
the algorithm for range consistency takes the variables. But they both attempt
to reach a situation in which the cardinality of a set of variables is equal to
the cardinality of the union of the corresponding domains, as was illustrated in
Example 4.



6 Hyper-arc Consistency 

A filtering algorithm that achieves hyper-arc consistency for constraints of
difference was proposed by Regin [18]. A similar result was obtained
independently by Costa [?]. Before we can introduce this algorithm, we have to
establish a connection with the maximum matching problem in graph theory. The
standard reference to matching theory is the book by Lovasz and Plummer [13].



6.1 Connections with Matching Theory 

Consider again the scheduling problem from Example 1. To illustrate the problem,
assume that Krzysztof and Luca decided not to speak. We now want to model this
problem graph-theoretically. First we introduce the definition of a bipartite
graph.

Definition 9 (Bipartite graph). A graph $G$ consists of a finite non-empty
set of elements $V$ called nodes and a set of pairs of nodes $E$ called edges.
If the node set $V$ can be partitioned into two disjoint non-empty sets $X$ and
$Y$ such that all edges in $E$ join a node from $X$ to a node in $Y$, we call
$G$ bipartite with bipartition $(X, Y)$. We also write $G = (X, Y, E)$.

The remaining speakers from Example 1 and their available times can be
represented by the bipartite graph in Figure 1. Both speakers and time periods
are represented by nodes, and these two sets of nodes are connected by edges,
giving the bipartition (Speakers, Times). We call the constructed bipartite
graph of an alldifferent constraint $C$ the value graph of $C$. Let $X_C$ denote
the variables occurring in a constraint $C$, with corresponding domains $D_C$.

Definition 10 (Value graph). Given an alldifferent constraint $C$, the
bipartite graph $GV(C) = (X_C, D_C, E)$ where $(x_i, d) \in E$ iff $d \in D_i$
is called the value graph of $C$.



[Figure: bipartite graph between the node sets Speakers and Time Slots]

Fig. 1. The value graph for the revised speech scheduling problem

Definition 11 (Maximum matching). A subset of edges in a graph G is 
called a matching if no two edges have a node in common. A matching of maxi- 
mum cardinality is called a maximum matching. A matching M covers a set X 
if every node in X is an endpoint of an edge in M . 

Note that a matching that covers the set of speakers in Figure 1 is a maximum
matching. The following proposition gives the link between a maximum matching
in a bipartite graph and hyper-arc consistency of the alldifferent constraint.

Proposition 6 (Regin [18]). The constraint
$C$: $\mathtt{alldifferent}(x_1, \dots, x_n)$ is hyper-arc consistent iff every
edge in its value graph $GV(C)$ belongs to a matching which covers $X_C$ in
$GV(C)$.



[Figure: the value graph between Speakers and Time Slots with a maximum matching drawn in fat lines]

Fig. 2. A maximum matching in the value graph

An illustration of Proposition 6 is given in Figures 2 and 3. The fat lines in
the graph of Figure 2 denote a maximum matching that covers all speaker nodes.
Not all edges belong to such a matching, and by Proposition 6 they can be
removed. When these edges are removed, the resulting alldifferent constraint is
hyper-arc consistent. This is depicted in Figure 3, which corresponds to
Table 2.



[Figure: the value graph between Speakers and Time Slots after removal of the filtered edges]

Fig. 3. The value graph after filtering

Table 2. Filtered time slots for the speakers

Speaker      Available
Sebastian    {5,6}
Frederic     {3,4}
Jan-Georg    {2,5}
Maarten      {3,4}
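
Table 2 can be reproduced by applying Proposition 6 edge by edge: an edge $(x_i, d)$ is kept only if, after fixing $x_i = d$, the remaining variables can still be matched to distinct values. The sketch below uses a simple augmenting-path search for that test. It is a direct illustration of the proposition under our own naming (`covering_matching_exists`, `filter_by_proposition_6`), not the efficient algorithm discussed in the next section.

```python
def covering_matching_exists(domains):
    """True if every variable can be matched to a distinct value."""
    var_of = {}                                  # value -> matched variable

    def augment(i, seen):
        # Try to match variable i, re-matching other variables if needed.
        for v in domains[i]:
            if v not in seen:
                seen.add(v)
                if v not in var_of or augment(var_of[v], seen):
                    var_of[v] = i
                    return True
        return False

    return all(augment(i, set()) for i in range(len(domains)))


def filter_by_proposition_6(domains):
    """Keep d in D_i only if edge (x_i, d) lies in a matching covering X_C."""
    result = []
    for i, dom in enumerate(domains):
        kept = set()
        for d in dom:
            # Fix x_i = d and forbid d everywhere else.
            fixed = [{d} if j == i else set(other) - {d}
                     for j, other in enumerate(domains)]
            if covering_matching_exists(fixed):
                kept.add(d)
        if not kept:
            return None                          # no solution at all
        result.append(kept)
    return result


# Sebastian, Frederic, Jan-Georg, Maarten (Krzysztof and Luca withdrawn).
speeches = [{3, 4, 5, 6}, {3, 4}, {2, 3, 4, 5}, {3, 4}]
```

Calling `filter_by_proposition_6(speeches)` yields the four domains listed in Table 2.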



6.2 An Algorithm for Achieving Hyper-arc Consistency 

An algorithm that achieves hyper-arc consistency for the alldifferent constraint
should remove all those edges in the corresponding value graph that do not
belong to a maximum matching. Berge has given a property that identifies exactly
these edges [?]. But first, we introduce some definitions we need for this
property.

Definition 12. Let $M$ be a matching in a graph $G = (V, E)$. An alternating
path or alternating cycle is a path or a cycle whose edges are alternately in
$M$ and in $E - M$. The length of a path or a cycle is the number of edges it
contains. A node is called free w.r.t. $M$ if it is not incident to a matching
edge.

For instance, in Figure 2, (3, F, 4, M, 3) is an even alternating cycle of
length 4. Node 6 is a free node.

Proposition 7 (Berge). An edge belongs to a maximum matching iff for some 
maximum matching, it belongs to either an even alternating path which begins 
at a free node, or to an even alternating cycle. 

With this property, we are able to identify and remove edges that are not in
any maximum matching. Note that we need to construct a maximum matching
before we can apply this property. The algorithm that achieves hyper-arc
consistency is represented in Figure 4. To construct the value graph $GV$, we
need $O(d|X_C| + |X_C| + |D_C|)$ steps, where $d$ is the maximum cardinality of
a variable domain. The procedure ComputeMaximumMatching(GV) computes a maximum
matching in the graph $GV$. This can be done for instance with a so-called
augmenting path algorithm. Hopcroft and Karp gave an implementation for this
that runs in $O(\sqrt{|X_C|}\, m)$ time, where $m$ is the number of edges of
$GV$ [10]. Their algorithm still remains essentially the best known [?].

Input: constraint of difference C, variables X and domains D
Output: false when no solution, otherwise true and updated domains
begin
  1  Build GV = (X_C, D_C, E)
  2  M(GV) <- ComputeMaximumMatching(GV)
  3  if |M(GV)| < |X_C| then return false
  4  RemoveEdgesFromG(GV, M(GV))
  5  return true
end

Fig. 4. An algorithm for achieving hyper-arc consistency

From Hall's Theorem we already know that whenever we find a subset of
nodes the cardinality of which exceeds the cardinality of the corresponding set
of domain values, no matching exists that saturates $X_C$. This is checked in
line 3. In the procedure RemoveEdgesFromG(GV, M(GV)) the actual filtering
takes place. Instead of applying Berge's property directly, we can translate the
problem in such a way that we have to search for the so-called strongly
connected components of the graph [?]. For this problem we can use an
implementation by Tarjan that runs in $O(n + m)$ time on graphs with $n$ nodes
and $m$ edges. In the algorithm from Figure 4, the search for a maximum matching
remains the dominant factor, hence the total algorithm runs in
$O(\sqrt{|X_C|}\, m)$ time.
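
The steps of Figure 4 can be sketched compactly in Python. Two simplifications are assumed here and should be read as such: the matching is computed with a plain augmenting-path search instead of the Hopcroft-Karp algorithm, and the strongly connected component computation is replaced by repeated reachability searches, which is slower but shorter. The function name `filter_hyper_arc` and the helper names are hypothetical.

```python
def filter_hyper_arc(domains):
    """Figure 4 in miniature: matching, feasibility test, edge removal."""
    n = len(domains)
    var_of = {}                                  # value -> matched variable

    def augment(i, seen):
        for v in domains[i]:
            if v not in seen:
                seen.add(v)
                if v not in var_of or augment(var_of[v], seen):
                    var_of[v] = i
                    return True
        return False

    if not all(augment(i, set()) for i in range(n)):
        return None                              # line 3: |M| < |X_C|
    value_of = {i: v for v, i in var_of.items()}

    def reachable(start):
        # Alternating walk: value -> variable over a non-matching edge,
        # then variable -> value over that variable's matching edge.
        seen, stack = set(start), list(start)
        while stack:
            v = stack.pop()
            for i in range(n):
                w = value_of[i]
                if v in domains[i] and w != v and w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    free = set().union(*domains) - set(var_of)   # values not used by M
    from_free = reachable(free)                  # even paths from free nodes
    filtered = []
    for i in range(n):
        on_cycle = reachable({value_of[i]})      # even alternating cycles
        filtered.append({d for d in domains[i]
                         if d == value_of[i] or d in from_free or d in on_cycle})
    return filtered
```

Applied to the revised scheduling problem, with domains `[{3, 4, 5, 6}, {3, 4}, {2, 3, 4, 5}, {3, 4}]`, this keeps exactly the edges of Figure 3. The two membership tests correspond to the two cases of Berge's property: an even alternating path starting at a free node, and an even alternating cycle.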

The notion of hyper-arc consistency was introduced by Mohr and Masini [16].
They also give a general algorithm to achieve this notion. For an $n$-ary
alldifferent constraint, where the domain size of all variables is bounded by
$d$, $|D_i| \leq d$, the time complexity of the general algorithm is
$O\left(\frac{d!}{(d-n)!}\right)$, whereas the time complexity of the above
algorithm is $O(dn\sqrt{n})$.

7 Conclusions and Future Work 

In this paper, an overview of several filtering techniques for the alldifferent
constraint has been given. A comparison of these different techniques has been
made by means of corresponding notions of local consistency and algorithms to
achieve them.

However, there are other interesting articles related to this subject that are
not considered in this paper. For instance, Focacci et al. [?] use information
from the alldifferent constraint for a filtering technique based on reduced
costs. Furthermore, in [?] Regin introduced the symmetric alldifferent
constraint, together with filtering algorithms for this constraint. Finally,
Bartak considers a dynamic version of the alldifferent constraint [?].